K-means clustering based on adaptive cuckoo optimization feature selection
Lin SUN, Menghan LIU
Journal of Computer Applications    2024, 44 (3): 831-841.   DOI: 10.11772/j.issn.1001-9081.2023030351

The K-means clustering algorithm determines its initial number of clusters randomly, and original datasets contain many redundant features, both of which reduce clustering accuracy; in addition, the Cuckoo Search (CS) algorithm suffers from slow convergence and weak local search. To address these issues, a K-means clustering algorithm combined with Dynamic CS Feature Selection (DCFSK) was proposed. Firstly, an adaptive step size factor was designed for the Levy flight phase to improve the search speed and accuracy of the CS algorithm. Then, the discovery probability was dynamically adjusted to balance global search against local search and to accelerate the convergence of the CS algorithm. On this basis, an Improved Dynamic CS algorithm (IDCS) was constructed, and a Dynamic CS-based Feature Selection algorithm (DCFS) was built from it. Secondly, to improve the calculation accuracy of the traditional Euclidean distance, a weighted Euclidean distance was designed that considers the contributions of both samples and features to the distance calculation. To determine the optimal number of clusters, weighted intra-cluster and inter-cluster distances were constructed from the improved weighted Euclidean distance. Finally, because the objective function of traditional K-means clustering considers only the distance within clusters and ignores the distance between clusters, an objective function based on the contour coefficient of median was proposed. Thus, a K-means clustering algorithm based on adaptive cuckoo optimization feature selection was designed. Experimental results show that IDCS achieves the best metrics on ten benchmark test functions. Compared to algorithms such as K-means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), DCFSK achieves the best clustering results on six synthetic datasets and six UCI datasets.
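
To make the idea of a weighted Euclidean distance concrete, here is a minimal Python sketch. It is only an illustration: the abstract does not give the paper's formula, so the function name `weighted_euclidean` and the per-feature weighting scheme are assumptions, and the sample-level weighting mentioned in the abstract is not modeled.

```python
import math


def weighted_euclidean(x, y, feature_weights):
    """Euclidean distance in which each feature's squared difference is scaled.

    Assumption: one non-negative weight per feature. The paper's actual
    weighting (which also accounts for samples) is not reproduced here.
    """
    if not (len(x) == len(y) == len(feature_weights)):
        raise ValueError("x, y and feature_weights must have equal length")
    return math.sqrt(
        sum(w * (a - b) ** 2 for a, b, w in zip(x, y, feature_weights))
    )


# With unit weights this reduces to the ordinary Euclidean distance.
plain = weighted_euclidean([0, 0], [3, 4], [1, 1])
# A zero weight removes a (for example, redundant) feature from the distance.
reduced = weighted_euclidean([0, 0], [3, 4], [1, 0])
```

Setting a feature's weight to zero has the same effect on the distance as deselecting that feature, which is one way to see how feature selection and distance weighting interact.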

Feature selection for imbalanced data based on neighborhood tolerance mutual information and whale optimization algorithm
Lin SUN, Jinxu HUANG, Jiucheng XU
Journal of Computer Applications    2023, 43 (6): 1842-1854.   DOI: 10.11772/j.issn.1001-9081.2022050691

Most feature selection algorithms do not fully consider the non-uniform class distribution of data, the correlation between features, or the influence of different parameters on the feature selection results. To address these problems, a feature selection method for imbalanced data based on neighborhood tolerance mutual information and Whale Optimization Algorithm (WOA) was proposed. Firstly, for binary and multi-class datasets in an incomplete neighborhood decision system, two kinds of feature importance for imbalanced data were defined on the basis of the upper and lower boundary regions. Then, to fully reflect the decision-making ability of features and the correlation between features, the neighborhood tolerance mutual information was developed. Finally, by integrating the feature importance of imbalanced data and the neighborhood tolerance mutual information, a Feature Selection for Imbalanced Data based on Neighborhood tolerance mutual information (FSIDN) algorithm was designed, in which the optimal parameters of the feature selection algorithm were obtained by using WOA, and a nonlinear convergence factor and an adaptive inertia weight were introduced to improve WOA and keep it from falling into local optima. Experiments on 8 benchmark functions show that the improved WOA has good optimization performance, and feature selection experiments on 13 binary and 4 multi-class imbalanced datasets show that, compared with other related algorithms, the proposed algorithm can effectively select feature subsets with good classification performance.
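
The role of a nonlinear convergence factor can be illustrated with a short Python sketch. In standard WOA the factor `a` decreases linearly from 2 to 0 over the iterations. The cosine schedule in `nonlinear_a` below is an assumed example of a nonlinear alternative, not the formula used in the paper, and both function names are hypothetical.

```python
import math


def linear_a(t, max_iter):
    """Standard WOA convergence factor: decreases linearly from 2 to 0."""
    return 2.0 - 2.0 * t / max_iter


def nonlinear_a(t, max_iter):
    """Assumed cosine schedule: also runs from 2 to 0, but decays slowly at
    first (longer exploration) and quickly near the end (faster exploitation).
    """
    return 2.0 * math.cos(math.pi * t / (2.0 * max_iter))


# Compare the two schedules at a few iterations of a 100-iteration run.
schedule = [(t, linear_a(t, 100), nonlinear_a(t, 100)) for t in (0, 50, 100)]
```

Under this assumed schedule the factor stays above the linear one throughout the run, so the search spends more iterations in the exploratory regime before contracting.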

Multilabel feature selection algorithm based on Fisher score and fuzzy neighborhood entropy
Lin SUN, Tianjiao MA, Zhan’ao XUE
Journal of Computer Applications    2023, 43 (12): 3779-3789.   DOI: 10.11772/j.issn.1001-9081.2022121841

The Fisher score model does not fully consider feature-label and label-label relations, and some neighborhood rough set models easily neglect the uncertainty of knowledge granulations in the boundary region, which results in low classification performance for these algorithms. To address these problems, a MultiLabel feature selection algorithm based on Fisher Score and Fuzzy neighborhood entropy (MLFSF) was proposed. Firstly, by using the Maximum Information Coefficient (MIC) to evaluate the feature-label association degree, the relationship matrix between features and labels was constructed, and the correlation between labels was analyzed by the relationship matrix of labels based on the adjusted cosine similarity. Secondly, a second-order strategy was given to obtain multiple second-order label relationship groups to reclassify the multilabel domain, where the strong correlation between labels was enhanced and the weak correlation between labels was weakened to obtain the score of each feature. In this way, the Fisher score model was improved to preprocess the multilabel data. Thirdly, the multilabel classification margin was introduced to define the adaptive neighborhood radius and neighborhood class, and the upper and lower approximation sets were constructed. On this basis, the multilabel rough membership degree function was presented, and the multilabel neighborhood rough set was mapped to the fuzzy set. Based on the multilabel fuzzy neighborhood, the upper and lower approximation sets and the multilabel fuzzy neighborhood rough set model were developed. Thus, the fuzzy neighborhood entropy and the multilabel fuzzy neighborhood entropy were defined to effectively measure the uncertainty of the boundary region. Finally, the Multilabel Fisher Score-based feature selection algorithm with second-order Label Correlation (MFSLC) was designed, and then the MLFSF was constructed. Experimental results on 11 multilabel datasets with the Multi-Label K-Nearest Neighbor (MLKNN) classifier show that, compared with six state-of-the-art algorithms including the Multilabel Feature Selection algorithm based on improved ReliefF (MFSR), MLFSF improves the mean Average Precision (AP) by 2.47 to 6.66 percentage points; meanwhile, MLFSF obtains the optimal values for all five evaluation metrics on most datasets.
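
For orientation, the classic single-label Fisher score that the abstract takes as its starting point can be sketched in Python as follows. This is the textbook form (between-class scatter divided by within-class scatter for one feature), not the improved multilabel model from the paper; the function name `fisher_score` is an assumption.

```python
def fisher_score(values, labels):
    """Classic Fisher score of a single feature.

    values: the feature's value for each sample
    labels: the class label of each sample
    Returns between-class scatter / within-class scatter; larger is better.
    """
    overall_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    between = 0.0  # sum over classes of n_c * (class mean - overall mean)^2
    within = 0.0   # sum over classes of n_c * (class variance)
    for members in groups.values():
        class_mean = sum(members) / len(members)
        between += len(members) * (class_mean - overall_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in members)
    return between / within if within > 0 else float("inf")


# A feature that separates the two classes scores high ...
separating = fisher_score([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1])
# ... while a feature with identical class means scores zero.
uninformative = fisher_score([1, 2, 1, 2], [0, 0, 1, 1])
```

Because this score looks at one feature and one label at a time, it cannot capture the feature-label and label-label relations that the abstract identifies as the motivation for the improved model.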

Feature selection algorithm based on neighborhood rough set and monarch butterfly optimization
Lin SUN, Jing ZHAO, Jiucheng XU, Xinya WANG
Journal of Computer Applications    2022, 42 (5): 1355-1366.   DOI: 10.11772/j.issn.1001-9081.2021030497

The classical Monarch Butterfly Optimization (MBO) algorithm cannot handle continuous data well, and the rough set model cannot sufficiently process large-scale, high-dimensional and complex data. To address these problems, a new feature selection algorithm based on Neighborhood Rough Set (NRS) and MBO was proposed. Firstly, local disturbance, a group division strategy and the MBO algorithm were combined, and a transmission mechanism was constructed to form a Binary MBO (BMBO) algorithm. Secondly, a mutation operator was introduced to enhance the exploration ability of this algorithm, and a BMBO based on Mutation operator (BMBOM) algorithm was proposed. Then, a fitness function was developed based on the neighborhood dependence degree in NRS, and the fitness values of the initialized feature subsets were evaluated and sorted. Finally, the BMBOM algorithm was used to search for the optimal feature subset through continuous iterations, and a meta-heuristic feature selection algorithm was designed. The optimization performance of the BMBOM algorithm was evaluated on benchmark functions, and the classification performance of the proposed feature selection algorithm was evaluated on UCI datasets. Experimental results show that the proposed BMBOM algorithm is significantly better than the MBO and Particle Swarm Optimization (PSO) algorithms in terms of the optimal value, worst value, average value and standard deviation on five benchmark functions. Compared with the optimized feature selection algorithms based on rough set, the feature selection algorithms combining rough set and optimization algorithms, the feature selection algorithms combining NRS and optimization algorithms, and the feature selection algorithms based on binary grey wolf optimization, the proposed feature selection algorithm performs well on UCI datasets in the three indicators of classification accuracy, number of selected features and fitness value, and can select the optimal feature subset with few features and high classification accuracy.
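
The neighborhood dependence degree on which the fitness function is built can be sketched in Python as below. This follows the common NRS definition (the fraction of samples whose delta-neighborhood contains only samples of the same class); the paper's full fitness function is not given in the abstract, and the name `neighborhood_dependency`, the binary `feature_mask` encoding and the Euclidean neighborhood are assumptions.

```python
import math


def neighborhood_dependency(samples, labels, feature_mask, delta):
    """Fraction of samples whose delta-neighborhood is label-consistent.

    feature_mask is a binary vector (1 = feature selected), the kind of
    candidate solution a binary optimizer would propose.
    """
    selected = [j for j, keep in enumerate(feature_mask) if keep]
    if not selected:
        return 0.0  # assumption: an empty subset carries no information

    consistent = 0
    for i, x in enumerate(samples):
        pure = True
        for k, z in enumerate(samples):
            dist = math.sqrt(sum((x[j] - z[j]) ** 2 for j in selected))
            if dist <= delta and labels[k] != labels[i]:
                pure = False  # a neighbor from another class was found
                break
        if pure:
            consistent += 1
    return consistent / len(samples)


data = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
classes = [0, 0, 1, 1]
informative = neighborhood_dependency(data, classes, [1, 0], delta=0.2)
constant = neighborhood_dependency(data, classes, [0, 1], delta=0.2)
```

In this toy data the first feature separates the classes and yields full dependency, while the constant second feature yields none, so a search over binary masks would favor the subset containing only the first feature.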

Drowsiness recognition algorithm based on human eye state
Lin SUN, Yubo YUAN
Journal of Computer Applications    2021, 41 (11): 3213-3218.   DOI: 10.11772/j.issn.1001-9081.2020122058

Most existing drowsiness recognition algorithms are based on machine learning or deep learning and do not consider the relationship between the sequence of closed-eye states and drowsiness. To solve this problem, a drowsiness recognition algorithm based on human eye state was proposed. Firstly, a human eye segmentation and area calculation model was proposed. Based on 68 feature points of the face, the eye area was segmented according to the largest polygon formed by the feature points of the human eye, and the total number of eye pixels was used to represent the size of the eye area. Secondly, the area of the human eye in the maximum state was calculated, a key frame selection algorithm was used to select the 4 frames that best represent the open-eye state, and the eye opening threshold was calculated from the areas of the human eye in these 4 frames and in the maximum state. On this basis, an eye closure degree score model was constructed to determine the closed state of the human eye. Finally, according to the eye closure degree score sequence of the input video, a drowsiness recognition model was constructed based on continuous multi-frame sequence analysis. Drowsiness state recognition was conducted on two commonly used international datasets, the Yawning Detection Dataset (YawDD) and the NTHU-DDD dataset. Experimental results show that the recognition accuracy of the proposed algorithm is more than 80% on both datasets and above 94% on YawDD. The proposed algorithm can be applied to driver status detection during driving, learner status analysis in class, and similar tasks.
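
The pipeline described above (eye area, open/closed decision, multi-frame analysis) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the polygon area is computed with the shoelace formula rather than by counting pixels as in the paper, the threshold is supplied by the caller rather than derived from key frames, and the function names and the "longest run of closed frames" rule are hypothetical stand-ins for the paper's scoring model.

```python
def polygon_area(points):
    """Area of a simple polygon from its vertices (shoelace formula).

    Used here as a stand-in for the pixel count of the eye region.
    """
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def longest_closed_run(areas, open_threshold):
    """Length of the longest run of consecutive frames judged closed."""
    longest = current = 0
    for area in areas:
        current = current + 1 if area < open_threshold else 0
        longest = max(longest, current)
    return longest


def is_drowsy(areas, open_threshold, min_closed_frames):
    """Assumed rule: drowsy if the eye stays closed for enough frames."""
    return longest_closed_run(areas, open_threshold) >= min_closed_frames


# Hypothetical eye landmarks forming a 4 x 2 rectangle.
eye_area = polygon_area([(0, 0), (4, 0), (4, 2), (0, 2)])
# Hypothetical per-frame eye areas from a short video clip.
frame_areas = [10, 3, 2, 1, 9, 2]
drowsy = is_drowsy(frame_areas, open_threshold=5, min_closed_frames=3)
```

Working on the sequence of per-frame states, rather than classifying each frame alone, is what lets a brief blink be distinguished from a sustained closure.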

CUDA based parallel implementation of simultaneous algebraic reconstruction technique
SHI Huai-lin, SUN Feng-rong, JIANG Wei, LIU Wei, QIN Tong, LI Xin-cai
Journal of Computer Applications    2011, 31 (05): 1245-1248.   DOI: 10.3724/SP.J.1087.2011.01245
Simultaneous Algebraic Reconstruction Technique (SART) can generate Computed Tomography (CT) images of higher quality than the Filtered Back-Projection (FBP) method when the projection data is incomplete or noisy. However, it is very time-consuming, and parallel computation is an efficient way to address this problem. In this study, a new parallel implementation of SART based on the Compute Unified Device Architecture (CUDA) platform was proposed. The experimental results show that there are no differences between the images reconstructed by this new method and those reconstructed by the serial implementation, while the reconstruction time is greatly decreased, which makes the method more suitable for clinical use.
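
For reference, one SART update can be written as a short serial Python sketch. It uses the standard SART form, in which each ray's residual is normalized by its row sum and each pixel's correction by its column sum. It is not the paper's CUDA implementation; the name `sart_step`, the dense list-of-lists system matrix and the toy data are assumptions for illustration.

```python
def sart_step(A, b, x, relax=1.0):
    """One SART iteration for the linear system A x = b.

    A: system matrix (rays x pixels) with non-negative weights
    b: measured projections, x: current image estimate
    """
    n_rays, n_pixels = len(A), len(x)
    row_sums = [sum(row) for row in A]
    col_sums = [sum(A[i][j] for i in range(n_rays)) for j in range(n_pixels)]

    # Forward projection: residual of every ray, normalized by its row sum.
    residuals = []
    for i in range(n_rays):
        projection = sum(A[i][j] * x[j] for j in range(n_pixels))
        if row_sums[i] > 0:
            residuals.append((b[i] - projection) / row_sums[i])
        else:
            residuals.append(0.0)

    # Back projection: all rays correct each pixel simultaneously.
    new_x = []
    for j in range(n_pixels):
        if col_sums[j] > 0:
            correction = sum(A[i][j] * residuals[i] for i in range(n_rays))
            correction /= col_sums[j]
        else:
            correction = 0.0
        new_x.append(x[j] + relax * correction)
    return new_x


# Toy system whose exact solution is x = [1, 2].
A = [[1.0, 1.0], [1.0, 0.0]]
b = [3.0, 1.0]
estimate = [0.0, 0.0]
for _ in range(200):
    estimate = sart_step(A, b, estimate)
```

Each ray's residual and each pixel's correction are computed independently of the others within a step, which is the property that makes the method a natural candidate for a parallel GPU implementation.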
Implementation of creative conception design system based on fractal
Yu-Lin SUN, Hong LIU, Xiao-Hui WANG
Journal of Computer Applications   
An approach to implementing a creative conception design system by using fractals was presented to improve the initial conception design process, which previously relied on mathematical functions. An example of creative design in architectural modeling was given: swarms were initialized by fractal operations such as merging, and were then evolved by an evolutionary algorithm. Experiments indicate that the fractal approach is promising for developing creative conception design systems.